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An Asymptotic Expansion for the Multivariate 
Normal Distribution and Mills 7 Ratio 



Harold Ruben* 

(September 20, 1963) 

An asymptotic expansion for the multivariate normal integral over an infinitely extended rectangle, 
and therefore also for the associated multivariate Mills' ratio, is developed. The expansion is valid 
provided the vertex of the rectangle lies in a polyhedral half-cone determined by the set of regression 
planes. 

The expansion obtained here is a natural generalization of the classic expansion for the normal 
univariate integral, and the coefficients in it involve the moments of the conjugate multinormal 
distribution. 



1. Introduction 



Defir 



y(x,M) = (27r)- /j/2 |M| 1 / 2 e- x Mx72 (L1 ) 

F(a,M)=f f(x,M)dx (1.2) 

and 

R(a,M)=F(a,M)//(a,M). (1.3) 

Thus,/(x, M) is the probability density of a random normal vector X = (X\, . . . , X n ) with expec- 
tation vector zero and nonsingular variance-covariance matrix M _1 ; F(a, M) is the probability that 
X ^ a for a fixed vector a = (ai, . . . , a n ), where X ^ a is to be interpreted as the set of simul- 
taneous inequalities X a = a a (ot=l, . . . , n); finally, /?(a, M) is (the n-variate) Mills' ratio for a 
multinormal distribution in the sense that it represents the ratio of the probability-mass in the 
infinitely extended Ai-dimensional rectangle x ^ a to the probability density at the vertex, a, of the 
rectangle. (This point is frequently referred to in this context as the "cutoff point.") Indeed, for 
the special case rt = l, with M— 1, we have the usual (univariate) Mills' ratio 

/?(a,l)= I*" Viry^e-^dxll&iry^e-" 2 ' 2 }' (1.4) 

J a 

In a recent paper in this journal, I. R. Savage l [14] 2 has obtained two useful and easily applied 
inequalities for /?(a, M), and therefore also for F(a, M), when aM > 0. We shall here obtain an 
asymptotic expansion for 7?(a, M) under the same conditions which produces a sequence of upper 
and lower bounds for 7?(a, M) and F(a, M). Savage's inequalities correspond to the first of these 
upper and lower bounds. 



*Department of Statistics, University of Minnesota. 

1 The notation adopted in this paper is that of Savage. It should be remarked that Savage's results and the results of the present paper were obtained inde- 
pendently via an essentially identical line of approach, and a slight degree of overlap in the current article relative to Savage's paper is therefore unavoidable. The 
present results may be regarded as forming a natural complement to, and extension of, Savage's results. 

* Figures in brackets indicate the literature references at the end of this paper. 



2. Derivation of the Expansion and Discussion of Its Properties 
On setting 



y = x-a 



(1.2) becomes 



fhere 



F(a, M)=/(a, M) • ( e-w-y™y'iWy, (2.1) 



A = aM; (2.2) 



that is, 



i,M) = J e~W- 



/?(a,M) = J e-w-y™y'i 2 dy (2.3) 

To obtain a series expansion for /?(a, M) from (2.3), expand exp (— yMy'/2) in its power series 
form round y = and, on the assumption that A > 0, integrate term by term. (The sense in which 
this purely formal procedure is justified is described below.) We note that yMy' is symmetric 
round y = 0, remaining unaffected when y is replaced by — y, so that 



e -yMy72 = y' 



OLi 



L! 



Inl 



where 

a h . . . /„ = (d''i + • • • +t Vdyi*i . . . ay,/'»)e-y M >" /2 | y =o 

and S'i . . . , i denotes summation over all nonnegative integral ii, . . . , i n such that U + . . .-\-i n 
is an even integer (including zero). Consider now a random normal vector X* = (Z*, . . . , X*) 
with expectation vector zero and variance-covariance matrix M: such a vector may be described 
as having a distribution conjugate to that of X. The coefficients a,-, . . . .. in the above series 
representation of exp (— yMy'/2) can then be expressed in terms of the moments of the conjugate 
distribution. For since the characteristic function of X* is given by 

E[e itx *'] = e- tm ' 12 , 

t denoting a fixed n-dimensional vector with real components, we have 

<*h •• in = ih+ ' '■ +l " Ml* ■■■«». 

where 

,*»*...,„ = £[(**)'. . . .(*>]. 

Hence (2.3) can be expressed in the form 

R(a,M)=r . . . T e-l^aVa V (- 1)«,+ ■ • • «J*tf . & . . V -^ dy, . . . dy n , 

Jo JO «!,... ,i„ ' ' ' ll! ln - 

and subsequent term by term integration yields the result that /?(a,M) can be formally identified 
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with the rc-variate series 



r °° i 

L j a = I J ta« 'a 



= S' (- D ( '» + • • ' + '» ) V? I . . . ,,, I! 7-TTT < A « > 0; a= 1, . . . , «). (2.5) 

i"i, . . . ,i n a=l a a 

Laplace's classic asymptotic expansion, a large, for the univariate Mills' ratio, as defined in (1.4), is 

RM „ I. { 1 _X + M_M5 + . . .j (a>0) (2 . 6) 

the symbol ~ denoting the fact that the series on its right is an asymptotic representation of the 
function on its left. If X is a (univariate) normal random variable with zero mean and unit standard 
deviation, then X* 9 the conjugate random variable, has the same distribution. The coefficient of 
l/a 2j in the bracketed expression in (2.6) is (— iy 1.3.5 . . . (2jf— 1), and this is the (2/)th moment of 
X* about the origin. From this point of view it is clear that (2.5) represents the multivariate analog 
of (2.6), since the fJL* . . , , n are the moments about the origin of the distribution of X*, i.e., of the 
distribution conjugate to that of X. However, the analogy is exhibited most clearly if (2.5) is ex- 
pressed in more explicit form as 

R(a,M)~ A * A ■ {C0-C1 + C2-. . .} (A o >0;a = l, . . . , n), (A) 

where 

q= x ;;-'" 5 0-0,1,...). (A.D 

(For a proof that the symbol ~ is justified in (A), see below.) Expansion (A) is also the form to be 
used for computing purposes. 



The first three Cj are given by 
Co=l, 



(2.7) 



r — / ^20 . . . 1 1 Mo . . . 02 \ 1 / Mno . . . 1 M1010 . . . 1 , Mo . . 

Ci_ ( a? + - • • +_ Ar _ rl"AiAr + "XAr h x^:j 

c 2 = (^H+. . .+^- A . 



. . 04 ) 
4 / 



M3IO ... I M3OIO ... I _[_ /V . . Q3 1 ■ M1 3Q. . . Q i Ml Q3Q. . ■ Q i M q , , , 13 \ 

A?A 2 ^ A*A 3 ^"'a^a/ A t A; ^ A, A* ^A n -iA*/ 



I / M220 ... I M2020 ... I 1 Mq . . . Q22 ) 

A 2 A 2 A 2 A 2 ' ' ' A 2 A 2 / 

^1 ^2 1 3 n-l n 7 



.* \ /,** 



■ / M2110. . . I 1 1 / MllllO ... 1 \ /r> n \ 

+ U?A 2 A 3 + - • •j + UiA 8 AaA 4 + - • 'J' (2 - 9) 



the different bracketed terms in C\ and C2 corresponding to the various partitionings of 2 and 4, 
respectively, into n nonnegative integers. (The last two bracketed terms in (2.9) vanish if n = 2, 



and the last bracketed term vanishes if n = S.) More generally, it is clear from (A.l) that the 
evaluation of Cj requires the enumeration of all possible partitionings of 2/ into n nonnegative 
integers. 

The expansion (A) for the multivariate Mills' ratio has some useful properties which are quite 
analogous to those of the corresponding expansion (2.6) for the univariate Mills' ratio. The proper- 
ties in question are as follows: 

(i) Expansion (A) is an enveloping one, in that /?(a, M) falls short of every summand with an odd 
number of terms and exceeds every summand with an even number of terms. 

(ii) The truncation error after any number of terms in (A) is numerically less than the first 
term omitted. 

(iii) Expansion (A) is asymptotic for large A a (as remarked previously after (A.l)). 

We now proceed to establish these properties. First, recall that (2.5) was obtained by expand- 
ing exp (— yMy'/2) as a power series in the y a and integrating term by term in (2.3). Correspond- 
ingly, (A) can be obtained by expanding exp (— yMy'/2) as a power series in yMy' and integrating 
term by term in (2.3); that is, 



#(a, M) ~ J) { —^- I e-*y'(yMyydy. 



j=0 J' J 2/^0 (2.10) 

Therefore, on using the generalized mean-value theorem, the error after m terms, E m , is given by 



Em —' 



ml 



- I e-^'CyMy')" 1 e-^ 2 ^y(£M£' < yMy'). 



Hence E m is positive if m is even and negative if m is odd. This proves the enveloping property. 
Next from (2.11). 



l±\ m C (2 12) 



I f 
'+- e- A *'(yMy') m dy, 

J y^O 



■1 , yi 



and since the right-hand member in the inequality (2.12) is numerically equal to the (/n+ljth 
term in the expansion (2.10), which is also the expansion (A), the second of the above-mentioned 
three properties is proved. 

Finally, to prove the asymptotic nature of (A), observe that in conjunction with the result 
just proved about the magnitude of the truncation error the desired result is achieved if it can be 
shown that the successive terms in (A) are in decreasing order of magnitude. It is clear from direct 
inspection that the latter terms do, in fact, stand in this relation to one another for sufficiently large 
Aa (a=l, . . . , n), and correspondingly that the series in (A), and its equivalent (2.10), do represent 
asymptotic representations of fl(a, M) for sufficiently large Ai, . . . , A, (use of the symbol ~ in the 
two equivalent formulas (A) and (2.10) being thereby justified). We shall, however, strengthen 
this by demonstrating that (A) is an asymptdtic representation of #(a, M) for sufficiently large 
values o/2 n A a 2 . This means that (A) can be used profitably even if some of the individual A a are 
small or moderately large, provided only that S^ 2 is large. Indeed, on setting 

y = r\ (|1| = D, 

L=A1\(? = 1M1' 

(1 is a unit ra-dimensional vector parallel to y), the series in (2.10) transforms to 



j£ jl Jo Jo, 



J = 



after integrating with respect to r. Here n is the surface of the unit n-sphere lying in the positive 
orthant. Denoting the angle between A and 1 by 6 and the norm of A by d, 

cos = A17(AA')* = L/(AA') 1/2 , 
d=(AA')*, 



the latter series is 



where 



_L.y (-iv-2l 
d n ^ Y dP 



(yt-1 + 2/)! 
aj Vj\ 



f(lM17(sec n + 2 J(9)^l. (2.13) 



Since a,- is a function only of M and of the orientation of A and is independent of the norm of A, 
we conclude from (2.13) that (A) represents an asymptotic expansion of /?(a, M) for positive A a 
and for large S" A 2 . 

We conclude this section by remarking that the fL*i v . . t needed in the basic expansion (A) 
(the /x*^. . . t are denned in (2.4)) can be evaluated expeditiously by contraction (cf Kendall and 
Stuart [5], sec. 13.13, p. 319) and by exploitation of the well-known fact that the product of an 
even number of normal random variables with zero expectations can be expressed in terms of 
products of covariances between the variables. Specifically, let Zi, . . . ,Z 2 j have a joint normal 
distribution with zero expectation vector and arbitrary, not necessarily nonsingular, variance- 
covariance matrix. The property 3 just stated in relation to the even order moments of a multi- 
normal distribution is 

fi[Z, . . . Z v ] = 2" E[Z ri Zr 2 ] . . . E[Zr 2 . _ ,Zg, (2.14) 

where (ri, r 2 , . . . , nj) is a permutation of (1, 2, ... , 2f) and 2" denotes summation over all possible 
ways of partitioning the set {1, 2, . . . , 2/} into j subsets, each of size two, the number of such ways 4 

being 1.3.5 (2/ — 1). To evaluate E[X*) l i . . . (X J)* n ], where i\ + . . . + i n = 2j, by contraction 

from (2.14), set 5 

|*f,a=l,2,...,i, 
X*, a = Ji + l, ii + 2, .. ., ii + ; 2 , 
: (2.i5) 

X% a = U + . . . + in-i + 1, ii + . . . + in-i + 2, . . . , 2/. 

3. An Important Special Case 

The problem of evaluating the probability that each of several standardized and equally 
correlated normal random variables shall not fall short of a specified value has been considered in 
some detail by various authors. Some applications of this problem, together with references 
relating to theoretical discussions are to be found in Ruben [11, 13]. (See also Bartholomew [1] 
for a further application.) 



3 It is of some interest that this property plays ah important role in other theoretical applications in statistics, e.g., the mathematical theory of Brownian motion 
and thermal noise. (See Wang and Uhlenbeck [17], p. 332.) 

4 As an example, for./ = 2, we have the familiar result 

£[Z,Z 2 Z3Z 4 ] = E&tZtfflZiZ*] + ^[Z 1 Z 3 1£IZ 2 Z 4 ] + E[Z,Z,\E\Z 2 Z Z ]. 

5 As an example, to evaluate E[(Xf) 2 X*X£], set Z t =Xf, Zz=X*, Z 3 = X£, Z 4 =^*, giving 

E[(X* fX*X*] = E[(X* ) 2 ]E[^*X*] + 2E[X*X*]E[XfX*l 



It may be useful to record here some of the main results available. Let M^ 1 denote an n X n 
matrix which has 1 for all its diagonal elements and p for all its off-diagonal elements, and let 
ao = (a, a, . . . , a). Then in the notation of (1.1) and (1.2), for p positive, 



F(a*Mo)=[" W~ 



P*x l 



/(%, 1) and F(x, 1) denoting the standardized normal density and distribution functions, respec- 
tively, evaluated at the point x. Formula (3.1) is a special case of a more general formula, due 
(independently) to Dunnett and Sobel [4], Das [3], and Stuart [15], in which F(a, M) is expressed 
as a univariate integral involving the standardized normal density and distribution functions when 
Pij, the correlation between X\ and Xj, is of the form p# = onajij ¥^ i), and a is arbitrary. Formula 
(3.1), specialized further by a = 0, was proved (again independently) by Ruben [7] and Moran [6]. 
More recently, Steck 6 and Owen [16] have shown that (3.1) may be extended to all p, positive or 
negative (p > — l/(n— 1)), the imaginary component of the right-hand integral being zero. Steck 
and Owen also provide valuable recursion relationships for F(a©, M ). 

Results of a rather more explicit character for F(a , M ) have been obtained by Ruben [11, 13]. 
In the first of these two papers, /?(ao, M ) was expressed as a convergent power series in a Ma<J, 
the coefficients in the series being simple multiples of generalized centroids, or geometrical 
moments, of a regular (n— l)-dimensional regular spherical simplex with common dihedral angle 
arc cos (— p); these centroids are, in their turn, expressible (Ruben [8]) in terms of the contents of 
regular spherical simplices tabulated elsewhere (Ruben [7]). In the second paper, /?(a , M ) was 
represented as an asymptotic expansion in negative powers of aoMag, the coefficients in the series 
being identical with the coefficients in the Taylor expansion of exp (— x 2 /2)K n -i(x) at x = 0, where 
K n -i{x) is the probability-mass, under a standardized (n— l)-dimensional spherical distribution, 
of a regular (n — l)-dimensional linear simplex with edges of length x and with centroid at the 
center of the distribution. The purpose of this section is to verify that expansion (A) of the current 
paper, for the special case ^(ao, Mo), agrees with that in [13], at any rate as far as the first three 
terms are concerned. (The two expansions in question are in fact completely identical.) At the 
same time this verification will serve to illustrate the use of the current expansion under more 
general conditions. 

It is readily established that M, the variance-covariance matrix of the conjugate distribution, 
has {l + (rc — 2)p} {1 + (ti— l)p} _1 (l— p)" 1 for its diagonal elements and — p{l + (n— l)p} -1 (l — p)" 1 
for its off-diagonal elements (the common correlation in the conjugate distribution being conse- 
quently — p{l-h(n — 2)p} _1 ). We find from (2.2) that the components of A are equal to (say) A 

defined by 

A = a/{l + (n-l)p}, (3.2) 

whence A > 0, if, and only if, a > 0. Assume, then, that a > 0, that is, the cutoff point ao lies on 
that part of the equiangular line x\ =xi = . . . = x n lying in the positive orthant. From (A.l) and (3.2), 



Cj = ^ + . .. + i B=i <X . . . in j • {l + (n-l)pW (, = 0, 1,. • .)• (3.3) 

Formulas (3.2) and (3.3) may now be used in formula (A) to evaluate /?(a , M ) after the (i* f 
have been determined. We proceed to evaluate C\ and Ci from (2.8) and (2.9). 

Here pt x . . . \ n is unaffected by permutation of ii, . . . i n , as is evident from the symmetry of 
the distribution. In particular, the moments occurring within any pair of braces in (2.8) and 
(2.9) are equal. Thus since 

/X2*o...o-{l + (n-2)p}/[{14-(n-l)p}(l-p)] 
and 

Mi*.o...o = -p/[{l + (n-l)p}(l-p)], 



6 Since this paper was written, a further paper by Steck on orthant probabilities (a = 0) for the equicorrelated multivariate normal distribution has appeared 
(Biometrika 49, 433-445, 1962). 



we have from (2.8) 

Ci=(/i/i 2 V..o + n(n-l)^^ 

(3.4) 

Again, from (2.9), 

C2 = /r/44b. . .o + n(n — l)-p,*io. . . o + %n(n — lj'/x&o. . . + |/i(/i — l)(n — 2)-/xf no . . . 

+ 24/1(71 — 1)(/i — 2)(/i. — 3)-/xfi 1 10 ... 0, 

where the moments occuring in C 2 are easily found (e.g., by contraction) to be given by 

M*4o...o = 3{l + (rc-2)p} 2 y, 
/x* 3 io. ..o = -3p{l-f(/i-2)p}y, 
At* 220 ...o = [{l+(Ai-2)p} 2 -h2p%, 
At*2iio ..o = -p{l + (^-4)p}y, 
M*nno. . . o = 3p 2 7, 

with y= {1 + (/? — l)p}~ 2 (l — p)~ 2 . After some reduction 

C2 = Ai{l + (A?-])p} 2 {(5+n)/2 + (/i 2 + 3n-16)p/2 + (n ;i + 2n 2 -2% + 50)/8}(l-p)- 2 /a 4 . (3.5) 

Higher order Cj may be found similarly with correspondingly heavier algebra. 
Our result is then 

F(ao, Mo) - (2tt)-" /2 {1 +(n- l)p} 1/2 (l -p)- ( " - ,,/2 e~ Wfl2 /^ J + <" - Ml 

{1-K/i-Dp}" 
— (Co - C , + C 2 - . . .), (3.6) 

where C\ and C 2 are given in (3.4) and (3.5) and Cq= 1 , the first term on the right of (3.6) represent- 
ing /ta , Mo), the probability density at the point a . The first three terms in (3.6) agree with 
those of expansion (2.23) of [13] when the function / M (a, p) in the latter paper is identified with 
the function F(a , M () ) of the present paper. Also, as remarked earlier, the first two terms of 
the series (3.6) were derived by Savage [14] (example 3) as giving a lower bound to F(a , M ), 
while the first term of the series gives an upper bound. However, from property (i) in section 2 
of this paper, all summands of the series in (3.6) with an odd number of terms yield upper bounds to 
F(a () ,Mo) and all summands with an even number of terms yield lower bounds. 



4. Scope of the Expansion and Possible Extension of Current Results 

It is clear from the form of A in (2.2) that expansion (A) for the multivariate normal integral 
will be particularly effective for large a a , a=l, . . . , n, i.e., when the distance between the cutoff 
point and the center of the distribution is large, and also under certain conditions of near degen- 
eracy when M _1 is "almost" singular and the probability in the distribution of X is correspondingly 
highly concentrated around certain linear subspaces. For example, in the special case considered 
in section 3, where degeneracy occurs for p= 1 and for p = — l/(n — 1), we note from (3.2) that ex- 
pansion (3.6) is most effective for high a and for large negative p. (In this connection, compare the 
two sets of upper and lower bounds for n = 2, ao — (3, 3) and p = ± 1/2 given by Savage [14] in his 
example 1.) 

In conclusion, we stress once more that expansion (A) is valid only for values of a such that 
aM > 0. Geometrically, this means that the cutoff point a is restricted to the interior of a poly- 
hedral half-cone bounded by the n flats xM = and with vertex at the center of the distribution. 



These flats are the regression planes (of X\ on X 2 , . . . , X n , of X 2 on X u Z 3 , . . . X n , etc.), and the 
statistical interpretation aM > is that the n residuals of a relative to the n regression planes are 
all positive. This is admittedly a rather severe restriction, and it would be desirable to have ex- 
pansions, both asymptotic and otherwise, valid for at least positive values of the components of the 
cutoff point (i.e., for a lying in the positive orthant). It is hoped that the present paper may be a 
helpful step in this direction. (Special reference should here be made to S. S. Gupta's review 
article and extensive bibliography on the multivariate normal integral and related topics in the 
September 1963 issue of Annals of Mathematical Statistics. The reader is also referred to a recent 
paper by Curnow and Dunnett [2].) The restriction aM > is, however, not serious for the special 
case 7i = 2, since it is known that the general bivariate normal integral with arbitrary cutoff point 
can be expressed in terms of the difference between two F-functions of the type discussed in sec- 
tion 3 (Ruben [9, 13]). 

That expansions of the sought for type exist (and the hunt for them therefore not chimerical) 
is readily established by methods virtually identical with those used in [11] and in section 3 of [13] 
for the equicorrelated case. In essence, this amounts to the following. The general multivariate 
normal integral is first identified with the probability-mass under a standardized spherical normal 
distribution of a polyhedral half-cone with angles between the bounding faces given by arc cos 
(— pij) and with vertex Fat a distance £ = (aMa') 1/2 from the center of the distribution. The inter- 
section of the half-cone with a unit sphere centered at V is a hyperspherical simplex with dihedral 
angles arc cos (— Po). Next, the required probability-mass in the half-cone is expressed as the 
product of (27r) -1/2 exp (— £ 2 /2) and an infinite convergent power series in £ with positive integral 
exponents, or, for 7 £ > 0, as the product of {2tt)~ 112 exp (— £ 2 /2) and an asymptotic (£ large), totally 
divergent power series in £ with negative integral exponents, where the coefficients in the two series 
are simple multiples of certain integrals over the spherical simplex. The latter integrals are, in 

fact, of the form I cos^e/l in the one case and I sec n+2j (/>rfl in the other case, where cf) denotes the 

angle between a given fixed line and that line joining V and a point on the surface of the sphere, 
while d\ is the surface-content of an infinitesimal element on the surface of the sphere. Unfor- 
tunately, however, these integrals appear to be intractable, and it is indeed clear that their evalu- 
ation will prove a formidable task, since even the special case 7 = 0, in the first type of integral, 
amounts to the determination of the content of the general hyperspherical simplex, i.e., to the 
solution of a long-standing and exceedingly difficult classical problem in rc-dimensional geometry. 
(For references to the latter problem, which is equivalent to evaluating the probability in an orthant 
under a centered multivariate normal distribution, and for a discussion of the content of regular 
hyperspherical simplices, see [10] and [12].) Nevertheless, it may prove feasible to obtain the 
coefficients for certain correlation structures. Thus the coefficients have been determined, in 
[11] and [13], for the particularly simple correlation structure specified by the property "equality 
of correlations." 
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